789 research outputs found

    Monte Carlo methods for the valuation of multiple exercise options

    Get PDF
    We discuss Monte Carlo methods for valuing options with multiple exercise features in discrete time. By extending the recently developed duality ideas for American option pricing, we show how to obtain estimates of the prices of such options using Monte Carlo techniques. We prove convergence of our approach and estimate the error. The methods are applied to options in the energy and interest rate derivative markets.
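    The paper works with dual (upper-bound) estimates, but the flavour of Monte Carlo valuation with multiple exercise rights can be illustrated with a plain primal estimator. The sketch below prices a Bermudan swing option with a fixed number of exercise rights by backward induction, using least-squares regression for continuation values; the model (geometric Brownian motion), payoff and all parameters are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def swing_option_lower_bound(n_paths=20000, n_steps=50, n_rights=3,
                             s0=100.0, strike=100.0, r=0.05, sigma=0.3, T=1.0,
                             seed=0):
    """Least-squares Monte Carlo estimate for a Bermudan swing option
    (at most `n_rights` exercises of a call payoff, one per date) under
    geometric Brownian motion. Illustrative only: model and parameters
    are assumptions, not taken from the paper."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    disc = np.exp(-r * dt)

    # Simulate GBM paths: shape (n_paths, n_steps + 1).
    z = rng.standard_normal((n_paths, n_steps))
    log_inc = (r - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * z
    s = s0 * np.exp(np.hstack([np.zeros((n_paths, 1)), np.cumsum(log_inc, axis=1)]))
    payoff = np.maximum(s - strike, 0.0)

    # value[k] holds the per-path value of having k exercise rights left.
    value = np.zeros((n_rights + 1, n_paths))
    value[1:, :] = payoff[:, -1]  # at maturity, exercise at most once

    # Backward induction; regressions estimate continuation values.
    for t in range(n_steps - 1, 0, -1):
        basis = np.vander(s[:, t] / s0, 4)  # cubic polynomial basis
        for k in range(n_rights, 0, -1):
            cont_hold = disc * value[k]      # keep all k rights
            cont_ex = disc * value[k - 1]    # spend one right now
            coef_hold = np.linalg.lstsq(basis, cont_hold, rcond=None)[0]
            coef_ex = np.linalg.lstsq(basis, cont_ex, rcond=None)[0]
            do_ex = payoff[:, t] + basis @ coef_ex > basis @ coef_hold
            value[k] = np.where(do_ex, payoff[:, t] + cont_ex, cont_hold)

    return disc * value[n_rights].mean()

print(swing_option_lower_bound())
```

    Because the exercise policy obtained from the regressions is generally suboptimal, this estimate is (up to Monte Carlo error) a lower bound on the true price; the duality approach discussed in the abstract complements such primal estimates with upper bounds.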

    Random intersection trees

    Get PDF
    Finding interactions between variables in large and high-dimensional datasets is often a serious computational challenge. Most approaches build up interaction sets incrementally, adding variables in a greedy fashion. The drawback is that potentially informative high-order interactions may be overlooked. Here, we propose an alternative approach for classification problems with binary predictor variables, called Random Intersection Trees. It works by starting with a maximal interaction that includes all variables, and then gradually removing variables if they fail to appear in randomly chosen observations of a class of interest. We show that informative interactions are retained with high probability, and the computational complexity of our procedure is of order $p^\kappa$ for a value of $\kappa$ that can reach values as low as 1 for very sparse data; in many more general settings, it will still beat the exponent $s$ obtained when using a brute-force search constrained to order $s$ interactions. In addition, by using some new ideas based on min-wise hash schemes, we are able to further reduce the computational cost. Interactions found by our algorithm can be used for predictive modelling in various forms, but they are also often of interest in their own right as useful characterisations of what distinguishes a certain class from others. This is the author's accepted manuscript. The final version of the manuscript can be found in the Journal of Machine Learning Research here: jmlr.csail.mit.edu/papers/volume15/shah14a/shah14a.pdf
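    A minimal sketch of the intersection step described above: each random branch starts from the set of all variables and repeatedly intersects it with the active variables of randomly drawn observations from the class of interest. This is a simplified single-path version of the tree-based procedure (and omits the min-wise hashing speed-up); names and parameters are illustrative.

```python
import numpy as np

def random_intersection_candidates(X, y, target_class=1, depth=5,
                                   n_branches=100, seed=0):
    """Sketch of the intersection idea behind Random Intersection Trees.

    X : binary (0/1) array of shape (n, p); y : class labels of shape (n,).
    Each branch starts from the set of all variables and repeatedly
    intersects it with the active variables of randomly chosen observations
    from the target class. Variable sets surviving `depth` intersections are
    returned as candidate interactions. Simplified illustration only."""
    rng = np.random.default_rng(seed)
    class_idx = np.flatnonzero(y == target_class)
    p = X.shape[1]
    candidates = set()

    for _ in range(n_branches):
        active = np.arange(p)                   # start with all variables
        for _ in range(depth):
            i = rng.choice(class_idx)           # random observation of the class
            active = active[X[i, active] == 1]  # keep variables present in it
            if active.size == 0:
                break
        if 0 < active.size <= 10:               # keep small surviving sets
            candidates.add(tuple(int(v) for v in active))
    return candidates
```

    Surviving sets are only candidates; they would still need to be checked for how much more prevalent they are in the target class than in the other classes before being reported.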

    Discussion: A tale of three cousins: Lasso, L2Boosting and Dantzig

    Full text link
    Discussion of "The Dantzig selector: Statistical estimation when $p$ is much larger than $n$" by Emmanuel Candes and Terence Tao [math/0506081]. Comment: Published at http://dx.doi.org/10.1214/009053607000000460 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    The xyz algorithm for fast interaction search in high-dimensional data

    Get PDF
    When performing regression on a data set with $p$ variables, it is often of interest to go beyond using main linear effects and include interactions as products between individual variables. For small-scale problems, these interactions can be computed explicitly, but this leads to a computational complexity of at least $O(p^2)$ if done naively. This cost can be prohibitive if $p$ is very large. We introduce a new randomised algorithm that is able to discover interactions with high probability and under mild conditions has a runtime that is subquadratic in $p$. We show that strong interactions can be discovered in almost linear time, whilst finding weaker interactions requires $O(p^\alpha)$ operations for $1 < \alpha < 2$ depending on their strength. The underlying idea is to transform interaction search into a closest pair problem which can be solved efficiently in subquadratic time. The algorithm is called xyz and is implemented in the language R. We demonstrate its efficiency for application to genome-wide association studies, where more than $10^{11}$ interactions can be screened in under 280 seconds with a single-core 1.2 GHz CPU. Isaac Newton Trust Early Career Support Scheme
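    The closest-pair transformation can be sketched as follows for predictors and response coded in {-1, +1}: a pair (j, k) carries a strong interaction when the columns y∘X_j and X_k agree on a large fraction of rows, and such near-identical column pairs can be found by bucketing columns on a small random subset of rows rather than scanning all pairs. The code below is an illustrative simplification, not the xyz implementation, and its parameter choices are assumptions.

```python
import numpy as np
from collections import defaultdict

def interaction_candidates(X, y, n_projections=20, subset_size=12, seed=0):
    """Sketch of the closest-pair idea behind interaction search.

    X : array of shape (n, p) with entries in {-1, +1}; y : response in {-1, +1}.
    An interaction (j, k) is strong when the columns Z_j = y * X_j and X_k
    agree on many rows. Instead of scanning all ~p^2 pairs, we repeatedly draw
    a small random subset of rows and bucket columns that agree exactly on
    that subset; columns sharing a bucket become candidate pairs.
    Illustrative simplification, not the xyz implementation."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    Z = y[:, None] * X                     # response-weighted columns
    candidates = set()

    for _ in range(n_projections):
        rows = rng.choice(n, size=subset_size, replace=False)
        buckets = defaultdict(lambda: ([], []))
        for j in range(p):
            buckets[tuple(Z[rows, j])][0].append(j)   # keys from Z-columns
            buckets[tuple(X[rows, j])][1].append(j)   # keys from X-columns
        for z_cols, x_cols in buckets.values():
            for j in z_cols:
                for k in x_cols:
                    if j != k:
                        candidates.add((j, k))
    # Candidates would then be ranked by their full interaction strength,
    # (1/n) * sum_i y_i X_ij X_ik, before being reported.
    return candidates
```

    Columns that agree on the random subset collide in the same bucket with a probability that grows quickly with their agreement fraction, which is why strong interactions surface after only a few repetitions.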

    Right singular vector projection graphs: fast high dimensional covariance matrix estimation under latent confounding

    Get PDF
    In this work we consider the problem of estimating a high-dimensional $p \times p$ covariance matrix $\Sigma$, given $n$ observations of confounded data with covariance $\Sigma + \Gamma \Gamma^T$, where $\Gamma$ is an unknown $p \times q$ matrix of latent factor loadings. We propose a simple and scalable estimator based on the projection on to the right singular vectors of the observed data matrix, which we call RSVP. Our theoretical analysis of this method reveals that in contrast to PCA-based approaches, RSVP is able to cope well with settings where the smallest eigenvalue of $\Gamma^T \Gamma$ is close to the largest eigenvalue of $\Sigma$, as well as settings where the eigenvalues of $\Gamma^T \Gamma$ are diverging fast. It is also able to handle data that may have heavy tails and only requires that the data has an elliptical distribution. RSVP does not require knowledge or estimation of the number of latent factors $q$, but only recovers $\Sigma$ up to an unknown positive scale factor. We argue this suffices in many applications, for example if an estimate of the correlation matrix is desired. We also show that by using subsampling, we can further improve the performance of the method. We demonstrate the favourable performance of RSVP through simulation experiments and an analysis of gene expression datasets collated by the GTEx consortium. Supported by an EPSRC First Grant and the Alan Turing Institute under the EPSRC grant EP/N510129/1
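    The confounding model in the abstract is straightforward to simulate. The sketch below generates data whose population covariance is $\Sigma + \Gamma \Gamma^T$ and extracts the right singular vectors of the (centred) data matrix, the quantity the estimator is built from; it does not reproduce the RSVP estimator itself, and the dimensions, the AR(1) choice of $\Sigma$ and the Gaussian design are assumptions made for illustration.

```python
import numpy as np

# Simulate the latent confounding model from the abstract: observations with
# covariance Sigma + Gamma Gamma^T, where Gamma holds q latent factor loadings.
# Dimensions, the AR(1) Sigma and the Gaussian design are illustrative choices;
# the RSVP estimator itself is not reproduced here.
rng = np.random.default_rng(0)
n, p, q = 200, 500, 3

# Target covariance Sigma: AR(1) correlation structure.
idx = np.arange(p)
Sigma = 0.6 ** np.abs(idx[:, None] - idx[None, :])

# Latent factor loadings Gamma (p x q) with large norm, so confounding dominates.
Gamma = rng.normal(scale=2.0, size=(p, q))

# Draw clean data from N(0, Sigma) and add confounding via latent factors H.
X_clean = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
H = rng.normal(size=(n, q))
X = X_clean + H @ Gamma.T             # cov(X) = Sigma + Gamma Gamma^T

# Right singular vectors of the centred data matrix, the building block of RSVP.
Xc = X - X.mean(axis=0)
_, sing_vals, Vt = np.linalg.svd(Xc, full_matrices=False)
V = Vt.T                              # columns are right singular vectors, p x min(n, p)
print(V.shape, sing_vals[:5])
```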

    Analysis of the Copenhagen Accord pledges and its global climatic impacts – a snapshot of dissonant ambitions

    Get PDF
    This analysis of the Copenhagen Accord evaluates emission reduction pledges by individual countries against the Accord's climate-related objectives. Probabilistic estimates of the climatic consequences for a set of resulting multi-gas scenarios over the 21st century are calculated with a reduced complexity climate model, yielding global temperature increase and atmospheric CO2 and CO2-equivalent concentrations. Provisions for banked surplus emission allowances and credits from land use, land-use change and forestry are assessed and are shown to have the potential to lead to significant deterioration of the ambition levels implied by the pledges in 2020. This analysis demonstrates that the Copenhagen Accord and the pledges made under it represent a set of dissonant ambitions. The ambition level of the current pledges for 2020 and the lack of commonly agreed goals for 2050 place in peril the Accord's own ambition: to limit global warming to below 2 °C, and even more so for 1.5 °C, which is referenced in the Accord in association with potentially strengthening the long-term temperature goal in 2015. Due to the limited level of ambition by 2020, the ability to limit emissions afterwards to pathways consistent with either the 2 or 1.5 °C goal is likely to become less feasible.

    A roadmap for rapid decarbonization

    Get PDF
    Although the Paris Agreement's goals (1) are aligned with science (2) and can, in principle, be technically and economically achieved (3), alarming inconsistencies remain between science-based targets and national commitments. Despite progress during the 2016 Marrakech climate negotiations, long-term goals can be trumped by political short-termism. Following the Agreement, which became international law earlier than expected, several countries published mid-century decarbonization strategies, with more due soon. Model-based decarbonization assessments (4) and scenarios often struggle to capture transformative change and the dynamics associated with it: disruption, innovation, and nonlinear change in human behavior. For example, in just 2 years, China's coal use swung from 3.7% growth in 2013 to a decline of 3.7% in 2015 (5). To harness these dynamics and to calibrate for short-term realpolitik, we propose framing the decarbonization challenge in terms of a global decadal roadmap based on a simple heuristic—a “carbon law”—of halving gross anthropogenic carbon-dioxide (CO2) emissions every decade. Complemented by immediately instigated, scalable carbon removal and efforts to ramp down land-use CO2 emissions, this can lead to net-zero emissions around mid-century, a path necessary to limit warming to well below 2°C.
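    The “carbon law” heuristic is simple enough to tabulate directly: halving gross CO2 emissions every decade from roughly 40 GtCO2/yr (an illustrative round number, not a figure from the article) reaches the low single digits by mid-century.

```python
# "Carbon law" heuristic: halve gross CO2 emissions every decade.
# The ~40 GtCO2/yr starting level for 2020 is an illustrative assumption.
emissions = 40.0  # GtCO2 per year
for year in range(2020, 2061, 10):
    print(f"{year}: ~{emissions:.1f} GtCO2/yr gross emissions")
    emissions /= 2.0
# Halving each decade brings gross emissions to ~2.5 GtCO2/yr by 2060, which,
# combined with scaled-up carbon removal, is consistent with the net-zero
# around mid-century described above.
```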

    A Human Development Framework for CO2 Reductions

    Get PDF
    Although developing countries are called to participate in CO2 emission reduction efforts to avoid dangerous climate change, the implications of proposed reduction schemes for the human development standards of developing countries remain a matter of debate. We show the existence of a positive and time-dependent correlation between the Human Development Index (HDI) and per capita CO2 emissions from fossil fuel combustion. Employing this empirical relation, extrapolating the HDI, and using three population scenarios, the cumulative CO2 emissions necessary for developing countries to achieve particular HDI thresholds are assessed following a Development As Usual (DAU) approach. If current demographic and development trends are maintained, we estimate that by 2050 around 85% of the world's population will live in countries with high HDI (above 0.8). In particular, 300 Gt of cumulative CO2 emissions between 2000 and 2050 are estimated to be necessary for the development of 104 developing countries in the year 2000. This value represents between 20% and 30% of previously calculated CO2 budgets limiting global warming to 2°C. These constraints and results are incorporated into a CO2 reduction framework involving four domains of climate action for individual countries. The framework reserves a fair emission path for developing countries to proceed with their development by indexing country-dependent reduction rates proportional to the HDI, in order to preserve the 2°C target after a particular development threshold is reached. Under this approach, global cumulative emissions by 2050 are estimated to range from 850 up to 1100 Gt of CO2. These values are within the uncertainty range of emissions to limit global temperatures to 2°C. Comment: 14 pages, 7 figures, 1 table

    Missing values: sparse inverse covariance estimation and an extension to sparse regression

    Full text link
    We propose an $\ell_1$-regularized likelihood method for estimating the inverse covariance matrix in the high-dimensional multivariate normal model in the presence of missing data. Our method is based on the assumption that the data are missing at random (MAR), which also covers the missing completely at random case. The implementation of the method is non-trivial, as the observed negative log-likelihood generally is a complicated and non-convex function. We propose an efficient EM algorithm for optimization with provable numerical convergence properties. Furthermore, we extend the methodology to handle missing values in a sparse regression context. We demonstrate both methods on simulated and real data. Comment: The final publication is available at http://www.springerlink.co
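    A generic EM scheme of the kind described can be sketched in a few lines: the E-step fills in the expected sufficient statistics of each partially observed row under the current Gaussian fit, and the M-step runs a graphical lasso on the completed covariance. The sketch below uses scikit-learn's graphical_lasso for the penalised M-step and illustrates the general approach rather than the authors' exact algorithm; the penalty level and iteration count are arbitrary choices.

```python
import numpy as np
from sklearn.covariance import graphical_lasso

def em_missing_glasso(X, alpha=0.1, n_iter=25):
    """Sketch of an EM scheme for l1-penalised inverse covariance estimation
    with missing values (NaN entries in X). E-step: expected sufficient
    statistics of each row given its observed entries under the current
    Gaussian fit; M-step: graphical lasso on the completed covariance.
    A generic illustration, not necessarily the authors' exact algorithm."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    obs = ~np.isnan(X)

    # Initialise with column means and a diagonal covariance from observed entries.
    mu = np.nanmean(X, axis=0)
    Sigma = np.diag(np.nanvar(X, axis=0) + 1e-3)

    for _ in range(n_iter):
        S = np.zeros((p, p))
        mu_new = np.zeros(p)
        for i in range(n):
            o, m = obs[i], ~obs[i]
            x_hat = X[i].copy()
            C = np.zeros((p, p))
            if m.any():
                Soo = Sigma[np.ix_(o, o)]
                Smo = Sigma[np.ix_(m, o)]
                reg = Smo @ np.linalg.solve(Soo, np.eye(o.sum()))
                # Conditional mean and covariance of the missing block.
                x_hat[m] = mu[m] + reg @ (X[i, o] - mu[o])
                C[np.ix_(m, m)] = Sigma[np.ix_(m, m)] - reg @ Smo.T
            mu_new += x_hat
            S += np.outer(x_hat, x_hat) + C
        mu = mu_new / n
        S = S / n - np.outer(mu, mu)
        # M-step: penalised covariance/precision from the completed statistics.
        Sigma, Theta = graphical_lasso(S, alpha=alpha)
    return Sigma, Theta
```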